Best Music Best You

Diya Peng (dp624) and Shenghua Li (3293)

Project Demo

Objective

Our goal is to build a Raspberry Pi system with a camera that recognizes the face in front of it. The Raspberry Pi then queries a local database for that person's favourite music style and uses existing music in that style to generate a new piece with deep learning.

Introduction

Building a Raspberry Pi system with a camera to capture the face in front of the camera
Using face recognition to identify different people
Using Natural Language Processing techniques to generate unique music for each person
Playing the music that belongs to a person once their face is recognized

Design

Design steps of face recognition

Install the camera on the Raspberry Pi
Face detection and dataset collection
Training recognizer
Face recognition
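The dataset collection and training steps above can be sketched roughly as follows. This is only a sketch under assumptions: the report does not say which detector or recognizer was used, so the Haar cascade and the LBPH recognizer (which needs the opencv-contrib build), the "User.<id>.<n>.jpg" file naming, and the helper names label_from_filename and train_recognizer are all illustrative choices, not the project's actual code.

```python
import os

def label_from_filename(path):
    """Parse the numeric user id from a name like 'User.3.17.jpg'
    (hypothetical naming scheme: User.<id>.<sample>.jpg)."""
    return int(os.path.basename(path).split(".")[1])

def train_recognizer(image_paths, model_path="trainer.yml"):
    """Detect faces in the sample images and train an LBPH recognizer on the crops."""
    # Imported lazily so this file loads even where OpenCV is not installed.
    import cv2  # assumption: opencv-contrib-python, which provides cv2.face
    import numpy as np

    # Assumption: a pip-installed OpenCV, which exposes cv2.data.haarcascades.
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    recognizer = cv2.face.LBPHFaceRecognizer_create()

    faces, ids = [], []
    for path in image_paths:
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        if gray is None:
            continue  # skip unreadable files
        for (x, y, w, h) in detector.detectMultiScale(
                gray, scaleFactor=1.2, minNeighbors=5):
            faces.append(gray[y:y + h, x:x + w])      # cropped face region
            ids.append(label_from_filename(path))     # label taken from the file name
    if not faces:
        raise ValueError("no faces found in the dataset")

    recognizer.train(faces, np.array(ids, dtype=np.int32))
    recognizer.write(model_path)  # saved model can be reloaded at recognition time
    return recognizer
```

At recognition time, the same detector would crop the face from each camera frame and pass it to recognizer.predict, which returns a label id and a distance value.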

Design steps of music generation

Collect and preprocess basic music data
Use Google Colab to train an LSTM-based NLP probability model
Each trained model can generate new music in the same style as its training data
Generate music in a new style by training on new data
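As a rough illustration of the preprocessing step, the sketch below turns a list of note tokens into fixed-length input windows and next-note targets, which is the usual shape of training data for a next-token LSTM. The function name build_sequences, the token strings, and the window length are assumptions for illustration; the report does not describe the actual encoding.

```python
def build_sequences(notes, seq_len):
    """Turn a list of note/chord tokens into (input window, next token) pairs."""
    vocab = sorted(set(notes))
    token_to_int = {tok: i for i, tok in enumerate(vocab)}
    inputs, targets = [], []
    for i in range(len(notes) - seq_len):
        window = notes[i:i + seq_len]
        inputs.append([token_to_int[t] for t in window])   # seq_len past tokens
        targets.append(token_to_int[notes[i + seq_len]])   # the token that follows
    return inputs, targets, token_to_int

# Toy example with made-up tokens.
notes = ["C4", "E4", "G4", "C4", "E4", "G4", "C5"]
X, y, mapping = build_sequences(notes, seq_len=3)
print(len(X), "training pairs, vocabulary size", len(mapping))
```

The model then learns a probability distribution over the next token given the window, and new music is produced by repeatedly sampling a token and sliding the window forward.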

Design steps of playing music with specific face

When a person stands in front of the Raspberry Pi camera and their face is in the collected dataset, the face is recognized and that person's specific music is played. This supports multiple users, each with their own music.
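One way to map a recognized person to their music is a simple lookup that also avoids restarting the same track on every camera frame. This is a minimal sketch, not the project's code: the class name TrackSelector, the user names, and the file paths are hypothetical.

```python
class TrackSelector:
    """Pick the track to start when a known face appears."""

    def __init__(self, playlist):
        self.playlist = dict(playlist)  # person name -> MIDI file path
        self.current = None             # person whose track was started last

    def on_face(self, name):
        """Return the track to start, or None if nothing should change."""
        track = self.playlist.get(name)
        if track is None or name == self.current:
            return None  # unknown face, or this person's music was already started
        self.current = name
        return track

# Hypothetical users and files.
selector = TrackSelector({"user_a": "music/user_a.mid", "user_b": "music/user_b.mid"})
first = selector.on_face("user_a")    # starts user_a's track
repeat = selector.on_face("user_a")   # same person again: no change
```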

Issues & Resolutions

When setting up OpenCV for Python 3, installing OpenCV with a single command did not work: the module could not be found afterwards. We solved this by updating apt-get and installing the related dependencies.
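A quick way to confirm whether the install succeeded is to ask Python whether it can locate the module before running the project. The helper name module_available is an assumption for illustration; importlib.util.find_spec is part of the standard library.

```python
import importlib.util

def module_available(name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(name) is not None

# 'cv2' is the module name that OpenCV's Python bindings install.
print("cv2 importable:", module_available("cv2"))
```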

The basic model from the reference was too slow to train: it uses a 3-layer LSTM, so each epoch took 2 hours, which was unacceptable. Based on my machine learning work experience, I decided that a 2-layer LSTM was sufficient and tuned some parameters (such as using the Adam optimizer and adjusting the learning rate). Each epoch then took about 20 minutes, with still acceptable results.
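A 2-layer LSTM of this kind might look like the following Keras sketch. Only the number of LSTM layers and the Adam optimizer come from the text; the unit count, dropout, learning rate, input shape, and loss are illustrative assumptions, and build_model and HPARAMS are hypothetical names.

```python
# Illustrative values only; the report does not state the actual hyperparameters.
HPARAMS = {"lstm_layers": 2, "units": 256, "dropout": 0.3, "learning_rate": 1e-3}

def build_model(seq_len, n_vocab, hp=HPARAMS):
    """Stacked LSTM that predicts the next note token from a window of seq_len tokens."""
    # Imported lazily so this file loads on machines without TensorFlow.
    from tensorflow import keras

    layers = [keras.Input(shape=(seq_len, 1))]
    for i in range(hp["lstm_layers"]):
        is_last = i == hp["lstm_layers"] - 1
        # Every LSTM except the last must return the full sequence for the next LSTM.
        layers.append(keras.layers.LSTM(hp["units"], return_sequences=not is_last))
        layers.append(keras.layers.Dropout(hp["dropout"]))
    layers.append(keras.layers.Dense(n_vocab, activation="softmax"))

    model = keras.Sequential(layers)
    model.compile(
        loss="sparse_categorical_crossentropy",  # integer-encoded next-token targets
        optimizer=keras.optimizers.Adam(learning_rate=hp["learning_rate"]),
    )
    return model
```

Dropping one LSTM layer removes a full recurrent pass per time step, which is consistent with the shorter epoch time reported above.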

MPlayer cannot play music in MIDI format; we resolved this by installing ‘timidity’.
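Playback through timidity can be started from Python with a subprocess. This is a minimal sketch assuming the timidity executable is on the PATH; the helper names timidity_command and play_midi are hypothetical.

```python
import shutil
import subprocess

def timidity_command(midi_path):
    """Build the command line that plays one MIDI file with timidity."""
    return ["timidity", midi_path]

def play_midi(midi_path):
    """Start playback in the background and return the process handle."""
    if shutil.which("timidity") is None:
        raise RuntimeError("timidity is not installed or not on PATH")
    # Popen returns immediately, so the camera loop can keep running;
    # call .terminate() on the handle to stop playback early.
    return subprocess.Popen(timidity_command(midi_path))
```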

Testing

We tested three parts of our project: face recognition, music generation, and the combination of face recognition and music generation.

Face recognition

The figure below shows the test result of face recognition: it recognized Shenghua Li’s face with 55% confidence and Diya Peng’s face with 46% confidence.
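The report does not say how the confidence percentage was computed. One common heuristic, assuming a recognizer that returns a distance where lower means a closer match (as OpenCV's LBPH recognizer does), is to display 100 minus the distance. The function names and the threshold below are assumptions for illustration.

```python
def match_percent(distance):
    """Map a recognizer distance (lower is better) to a 0-100 display value."""
    return max(0, min(100, round(100 - distance)))

def label_text(name, distance, threshold=100):
    """Return (displayed name, percent); faces at or past the threshold are 'unknown'."""
    if distance >= threshold:
        return "unknown", match_percent(distance)
    return name, match_percent(distance)

print(label_text("user_a", 45.0))
```

Under this heuristic the percentage is a display convenience rather than a calibrated probability, so values around 50% can still correspond to correct matches.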

Music Generation

The figure below shows a piece of music generated by learning from Schubert's piano music. As the figure shows, the generated chords look reasonable, and they also sound nice.

Combination of face recognition and music generation

The figure below shows the test result of the whole project: the RPi captures the face and then decides which music to play (music generated by the NLP model from existing music of the same style).

Results & Conclusion

Our project met our expectations.

Face recognition

The camera can successfully detect multiple people’s faces and recognize them.

Music generation

The NLP model can be trained successfully and generates new, pleasant-sounding pieces of music.

Combination of face recognition and music generation

When a person comes in front of the Raspberry Pi camera, the speaker plays the unique music generated for that person. Our project can be applied to multiple users.

Future Work

We will enable a pygame screen in our project, so that when a person comes in front of the Raspberry Pi camera, they can see the photo taken by the camera on the screen, along with the name of the music being played.

Budget

Vendor	Description	Quantity	Unit Cost($)	Total Cost($)
ECE Department	Raspberry Pi 3B	1	35.00	35.00
ECE Department	Raspberry Pi camera module V2	1	25.00	25.00
ECE Department	JBL Speaker	1	25.00	25.00
			Total	85

References

Work Distribution

For our ‘Best Music Best You’ project, Diya implemented the face recognition part, Shenghua implemented the music generation part, and we did the integration together.

Code Appendix

Github Repo

ECE 5725: Embedded Operating Systems